Skip to content

repeat_row -- disallow on WITH ORDINALITY; add repeat_row_non_negative#36174

Open
ggevay wants to merge 3 commits intoMaterializeInc:mainfrom
ggevay:repeat_row-1-with-ord
Open

repeat_row -- disallow on WITH ORDINALITY; add repeat_row_non_negative#36174
ggevay wants to merge 3 commits intoMaterializeInc:mainfrom
ggevay:repeat_row-1-with-ord

Conversation

@ggevay
Copy link
Copy Markdown
Contributor

@ggevay ggevay commented Apr 21, 2026

This PR kicks off the repeat_row productionization work stream. Resolves SQL-160. Resolves SQL-161.

I suggest reviewing commit by commit:

  1. just some renamings to match up internal names to the SQL name
  2. Add repeat_row_non_negative as a new SQL function, which errors out on a negative input (instead of producing a negative number of rows). It's less powerful than repeat_row, but its advantage is that it doesn't have the restrictions and dangers that repeat_row has. The immediate motivation for this was to be able to keep using this in tests of WITH ORDINALITY when repeat_row is disallowed, but it might also be useful for users for other purposes, so users could be allowed to freely use it. (For now, it's guarded by a feature flag which defaults to off, and I haven't documented it yet, but if we think this would be indeed a useful function for users, then I can add docs and turn it on in a follow-up PR.)
  3. The main point of the PR: disallows repeat_row in WITH ORDINALITY and in some other constructs that are implemented using WITH ORDINALITY. We need to make this change for productionizing repeat_row, because WITH ORDINALITY is not well-defined for negative diffs, and the implementation panics on a negative diff. (Note that this is not violating backwards compatibility requirements, because repeat_row was not officially supported so far, and has not been turned on for most users.)

Note: An alternative to disallowing repeat_row with WITH ORDINALITY would be to make WithOrdinality::eval not panic on negative diffs. However, this seems to be not easily possible without a performance hit or complicating the code: We'd need to either

  • eval the table function twice, only checking the diffs at the first evaluation, costing CPU;
  • save the eval result in a Vec, costing memory;
  • make the flat_map closure somehow fallible, for which I don't see an easy way. (E.g., we could make the iterator's item type Result, but then would need to touch every caller of TableFunc::eval and every arm of the big match in TableFunc::eval, and might slow things down a bit. Or we could report the error in an Rc<Cell<...>>, which also feels convoluted, and potentially costing CPU.)

We could reconsider making it possible generally to return errors in the middle of table function iterators, if we ever find a need for this in some other table function. But repeat_row together with WITH ORDINALITY feels so rare, that it seems ok to just disallow this combination for now. Edit: https://github.com/MaterializeInc/database-issues/issues/11324 would be one more reason to implement error returns inside this iterator, but that issue seems also quite low priority.

@ggevay ggevay added the A-ADAPTER Topics related to the ADAPTER layer label Apr 21, 2026
@ggevay ggevay requested review from a team as code owners April 21, 2026 10:28
@ggevay ggevay added the A-CLUSTER Topics related to the CLUSTER layer label Apr 21, 2026
@ggevay ggevay requested a review from SangJunBak April 21, 2026 10:28
ggevay added 2 commits April 21, 2026 12:55
Also disallow it in constructs that are implemented by WITH ORDINALITY
under the hood:
- ROWS FROM
- implicit ROWS FROM, i.e., multiple table functions in a SELECT clause
@ggevay ggevay force-pushed the repeat_row-1-with-ord branch from 3ede8a8 to b441fcb Compare April 21, 2026 10:56
Copy link
Copy Markdown
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECT * FROM repeat_row_non_negative(9223372036854775807) WITH ORDINALITY LIMIT 1;

Returns wrong result:

 ordinality
------------
(0 rows)

Should be 1.

@ggevay
Copy link
Copy Markdown
Contributor Author

ggevay commented Apr 21, 2026

@def-, good catch! The bug is in WITH ORDINALITY, and is also present on main with

SELECT * FROM repeat_row(9223372036854775807) WITH ORDINALITY LIMIT 1;

It's coming from inside WithOrdinality::eval, where next_ordinal + diff overflows. Unfortunately, there is no easy way to report an error from inside that iterator at the moment (as explained in the last part of the PR description). So, fixing this would need a bigger refactoring, so I'll create a separate issue for it. Edit: https://github.com/MaterializeInc/database-issues/issues/11324

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ADAPTER Topics related to the ADAPTER layer A-CLUSTER Topics related to the CLUSTER layer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants